NSF PAR Search | NSF Public Access Repository


Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer

https://doi.org/10.18653/v1/2023.findings-emnlp.183

Zhang, Qingru; Ram, Dhananjay; Hawkins, Cole; Zha, Sheng; Zhao, Tuo (January 2023, Association for Computational Linguistics)

Full Text Available
Towards Compact Neural Networks via End-to-End Training: A Bayesian Tensor Approach with Automatic Rank Determination

https://doi.org/10.1137/21M1391444

Hawkins, Cole; Liu, Xing; Zhang, Zheng (March 2022, SIAM Journal on Mathematics of Data Science)

Full Text Available
General-Purpose Bayesian Tensor Learning With Automatic Rank Determination and Uncertainty Quantification

https://doi.org/10.3389/frai.2021.668353

Zhang, Kaiqi; Hawkins, Cole; Zhang, Zheng (January 2022, Frontiers in Artificial Intelligence)

A major challenge in many machine learning tasks is that a model's expressive power depends on its size. Low-rank tensor methods are an efficient tool for handling the curse of dimensionality in many large-scale machine learning models. The major challenges in training a tensor learning model include how to process high-volume data, how to determine the tensor rank automatically, and how to estimate the uncertainty of the results. While existing tensor learning methods focus on specific tasks, this paper proposes a generic Bayesian framework that can be employed to solve a broad class of tensor learning problems such as tensor completion, tensor regression, and tensorized neural networks. We develop a low-rank tensor prior for automatic rank determination in nonlinear problems. Our method is implemented with both stochastic gradient Hamiltonian Monte Carlo (SGHMC) and Stein variational gradient descent (SVGD). We compare the automatic rank determination and uncertainty quantification of these two solvers. We demonstrate that our proposed method can determine the tensor rank automatically and can quantify the uncertainty of the obtained results. We validate our framework on tensor completion tasks and tensorized neural network training tasks.
Full Text Available
Bayesian tensorized neural networks with automatic rank selection

https://doi.org/10.1016/j.neucom.2021.04.117

Hawkins, Cole; Zhang, Zheng (September 2021, Neurocomputing)

Tensor decomposition is an effective approach to compress over-parameterized neural networks and to enable their deployment on resource-constrained hardware platforms. However, directly applying tensor compression in the training process is challenging because it is difficult to choose a proper tensor rank. To address this challenge, this paper proposes a low-rank Bayesian tensorized neural network. Our Bayesian method performs automatic model compression via adaptive tensor rank determination. We also present approaches for posterior density calculation and maximum a posteriori (MAP) estimation for the end-to-end training of our tensorized neural network. We provide experimental validation on a two-layer fully connected neural network, a 6-layer CNN, and a 110-layer residual neural network, where our work produces 7.4× to 137× more compact neural networks directly from training while achieving high prediction accuracy.
Full Text Available
3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration

https://doi.org/10.1145/3453688.3461738

Chen, Yao; Hawkins, Cole; Zhang, Kaiqi; Zhang, Zheng; Hao, Cong (June 2021, Great Lakes Symposium on VLSI)

Full Text Available
On-FPGA training with ultra memory reduction: A low-precision tensor method

Zhang, Kaiqi; Hawkins, Cole; Zhang, Xiyuan; Hao, Cong; Zhang, Zheng (May 2021, ICLR Workshop on Hardware Aware Efficient Training)

Various hardware accelerators have been developed for energy-efficient and real-time inference of neural networks on edge devices. However, most training is done on high-performance GPUs or servers, and the huge memory and computing costs prevent training neural networks on edge devices. This paper proposes a novel tensor-based training framework, which offers orders-of-magnitude memory reduction in the training process. We propose a novel rank-adaptive tensorized neural network model and design a hardware-friendly low-precision algorithm to train this model. We present an FPGA accelerator to demonstrate the benefits of this training method on edge devices. Our preliminary FPGA implementation achieves a 59× speedup and 123× energy reduction compared to an embedded CPU, and a 292× memory reduction over standard full-size training.
Full Text Available
Tensor Methods for Generating Compact Uncertainty Quantification and Deep Learning Models

https://doi.org/10.1109/ICCAD45719.2019.8942121

Cui, Chunfeng; Hawkins, Cole; Zhang, Zheng (November 2019, International Conference on Computer-Aided Design)

Full Text Available
Variational Bayesian Inference for Robust Streaming Tensor Factorization and Completion

https://doi.org/10.1109/ICDM.2018.00200

Zhang, Zheng; Hawkins, Cole (November 2018, IEEE International Conference on Data Mining (ICDM))

Streaming tensor factorization is a powerful tool for processing high-volume and multi-way temporal data in Internet networks, recommender systems, and image/video data analysis. Existing streaming tensor factorization algorithms rely on least-squares data fitting and do not possess a mechanism for tensor rank determination. This leaves them susceptible to outliers and vulnerable to over-fitting. This paper presents a Bayesian robust streaming tensor factorization model to identify sparse outliers, automatically determine the underlying tensor rank, and accurately fit low-rank structure. We implement our model in MATLAB and compare it with existing algorithms on tensor datasets generated from dynamic MRI and Internet traffic.
Full Text Available
